The influence of time on the sensitivity of SARS-CoV-2 serological testing

Sensitive serological testing is essential to estimate the proportion of the population exposed or infected with SARS-CoV-2, to guide booster vaccination and to select patients for treatment with anti-SARS-CoV-2 antibodies. The performance of serological tests is usually evaluated at 14–21 days post infection. This approach fails to take account of the important effect of time on test performance after infection or exposure has occurred. We performed parallel serological testing using 4 widely used assays (a multiplexed SARS-CoV-2 Nucleoprotein (N), Spike (S) and Receptor Binding Domain assay from Meso Scale Discovery (MSD), the Roche Elecsys-Nucleoprotein (Roche-N) and Spike (Roche-S) assays and the Abbott Nucleoprotein assay (Abbott-N) on serial positive monthly samples collected as part of the Co-STARs study (www.clinicaltrials.gov, NCT04380896) up to 200 days following infection. Our findings demonstrate the considerable effect of time since symptom onset on the diagnostic sensitivity of different assays. Using a time-to-event analysis, we demonstrated that 50% of the Abbott nucleoprotein assays will give a negative result after 175 days (median survival time 95% CI 168–185 days), compared to the better performance over time of the Roche Elecsys nucleoprotein assay (93% survival probability at 200 days, 95% CI 88–97%). Assays targeting the spike protein showed a lower decline over the follow-up period, both for the MSD spike assay (97% survival probability at 200 days, 95% CI 95–99%) and the Roche Elecsys spike assay (95% survival probability at 200 days, 95% CI 93–97%). The best performing quantitative Roche Elecsys Spike assay showed no evidence of waning Spike antibody titers over the 200-day time course of the study. We have shown that compared to other assays evaluated, the Abbott-N assay fails to detect SARS-CoV-2 antibodies as time passes since infection. In contrast the Roche Elecsys Spike Assay and the MSD assay maintained a high sensitivity for the 200-day duration of the study. These limitations of the Abbott assay should be considered when quantifying the immune correlates of protection or the need for SARS-CoV-2 antibody therapy. The high levels of maintained detectable neutralizing spike antibody titers identified by the quantitative Roche Elecsys assay is encouraging and provides further evidence in support of long-lasting SARS-CoV-2 protection following natural infection.

www.nature.com/scientificreports/ vaccination and identify candidates for SARS-CoV-2 antibody therapy. The rapid response to the COVID-19 pandemic has led to the development of a wide range of serological tests suitable for evaluating SARS-CoV-2 exposure, infection or vaccination status [1][2][3] . Typically, these tests are approved for use by the regulatory authorities based on their performance against a panel of reference sera including positive and negative controls at either 14-or 21-days post infection 4 . Public Health England reported a 93.9% sensitivity for the Abbott SARS-CoV-2 IgG Nucleoprotein assay 5 and 100% for the Roche Elecsys Nucleoprotein assay at ≥ 14 days post infection 6 . This led to widespread adoption of these tests across NHS laboratories for testing at population level. Other studies have confirmed this test performance at 14-21 days post infection 7,8 . Population level serological studies have also based their conclusions-vital to guide national policy-on the basis of these tests 9 without considering how time since infection influences the performance of the test. The problem with this approach is that it does not take into account SARS-CoV-2 humoral dynamics and changes in avidity over time 10,11 . Although serological tests with limited diagnostic range may demonstrate excellent sensitivity shortly after infection, it is unclear how they will perform with time following infection or vaccination.
In order to address this question, we applied 4 widely used serological assays in parallel to serial samples from the Co-STARs study 12 in which staff testing seropositive to SARS-CoV-2 were followed for up to 200 days following infection. We compared the proportion of samples that remained seropositive over time using a survival analysis and determined the decay rate of the nucleoprotein (N) antibody and the spike (S) antibody for each test using a previously published mathematical model fitted to the data.

Materials and methods
Study setting and design. Serological testing was performed on stored serum samples collected as part of the Co-STARs study (www. clini caltr ials. gov, NCT04380896), approved by the UK National Health Service Health Research Authority and run at Great Ormond Street Hospital between April 29th and November 2020 in accordance with the relevant guidelines 10 . Briefly, Co-STARs was a 1-year single-centre prospective cohort study of antibody responses to COVID-19 infection in healthcare workers. Serum samples were taken from the 3657 participants at baseline and underwent a screening ELISA using the EDI assay. Repeated monthly serum samples were then taken from those with a seropositive baseline screening test for up to 250 days after the date of infection. Written informed consent was obtained from all participants. Those samples identified as seropositive with available symptom start date had further confirmatory testing with the quantitative three antigen MSD assay. Study participants. The majority of hospital staff were eligible for the Co-STARS study 12 . Only those participants with significant immunosuppression, those that had received blood products within 6 months of recruitment and those that had active and ongoing symptoms of SARS-CoV-2 infection (within the last 21 days) were excluded. Only samples from individuals with at least one positive test from any platform were included in the analysis. Moreover, individuals without a known symptom start date were removed.

Data collection.
As part of the Co-STARs study all participants undertook a detailed standardised online questionnaire at study entry 12 . This included the date of onset of COVID-19 symptoms, and any SARS-CoV-2 diagnostic test results.
Comparison of serological assays. Samples taken as part of the Co-STARS study 12 which had an accompanying symptom start date available for analysis were initially screened for seropositivity by the EDI assay or by any of the three antigens of the Meso Scale Discovery (MSD) assay. The selected samples each underwent testing with 4 serological assays: (1) The Roche Elecsys Anti-SARS-CoV-2 electrochemiluminescence immunoassay (ECLIA) assay detects the nucleocapsid (N) antigen (Roche-N); (2) the Roche Elecsys Anti-SARS-CoV-2 S electrochemiluminescence immunoassay (ECLIA) assay detects the spike (S) antigen (Roche-S); (3) the Abbott Nucleoprotein Chemiluminescent Microparticle Immunoassay (CLMIA) assay detects the nucleocapsid (N) antigen (Abbott-N); (4). All tests were performed as per manufacturer's specifications.The four antigen Meso Scale Discovery (MSD) assay was undertaken at the WHO Pneumococcal Supranational Reference Laboratory at the UCL Institute of Child Health. Only 3 antigens were reported from the MSD assay (the Spike, the Nucleoprotein and the RBD) as the baseline test performance of the N-terminal domain (NTD) antibody response was insufficient for further evaluation as previously reported 13 . The Roche-N and Roche-S assays were undertaken by the Laboratory Medicine Service of Swansea Bay University Health Board, Morriston Hospital, Swansea. The Abbott-N assay was undertaken by Public Health Wales Microbiology at Cardiff and Vale University Hospital. All samples were stored and transported between laboratories at − 80 °C and only removed for aliquoting prior to testing to avoid unnecessary freeze-thaw cycles.
Statistical analysis and modelling. In order to evaluate the relative proportion of seropositive tests in the parallel serological assays over time, a time-to-event analysis was performed using the time from symptom onset and the first negative test for each assay after a first positive test as the event of interest using the R package survival 14,15 . Only tests taken > 14 days after symptom onset were considered in the analysisNo tests were performed between 14-and 21-days post symptoms, and thus using 14 or 21 days post symptom onset as threshold did not affect our results. A participant was defined as seropositive when at least one of the 4 tests undertaken was seropositive. If the other tests that were run in parallel never became seropositive, the time-to-event was set to the earliest test taken for that individual. If a participant never became seronegative during the follow-up period, a right-censored observation was added at the time of the last serological test. www.nature.com/scientificreports/ Additionally, the decay rate after 21 days since symptom onset was estimated using a Bayesian generalized linear mixed model as implemented in the R package MCMCglmm 16 , where time from symptom onset was included as a fixed effect and study participants as a random effect. Therefore, a unique slope for the regression was estimated for the entire population, while the intercept was allowed to vary between the study participants. The decay rate was estimated from the slope of the linear model.
To assess the overall diagnostic capability of the Abbott-N assay, a receiver operating characteristic curve (ROC) analysis was performed using the pROC package within R 17 . The MSD-N and Roche-N assays were used as the gold standard for the comparison.
Ethical approval and consent. The study had national Integrated Research Application System (IRAS) approval and all participants in the study provided informed consent.

Results
A total of 950 samples from 329 participants seropositive by any assay after 14 days underwent testing with the Roche-N, Roche-S, the MSD and the Abbott-N assay. The majority of the participants (98%, 321/329) had a positive result by two or more assays. Antibody decay with time. Plotting the raw log transformed antibody titers over time since symptom onset (Fig. 1) demonstrated that antibody dynamics were dependent on the assay undertaken. The production of spike antibodies was demonstrated to be maintained at high levels up to 200 days when evaluated by the MSD and the quantitative Roche -S assay. All nucleoprotein antibody assays demonstrated decay of the nucleoprotein antibody over time. This was most pronounced in the Abbott-N assay and much less so in the Roche -N assay which demonstrated slow waning of the nucleoprotein antibody.
Assay sensitivity with time post symptom onset. The existing published test performance for all assays undertaken is provided in Table 1. The sensitivity of all assays (at least 14 days from symptom onset) at 50, 100 and 150 days is provided in Table 2. All assays demonstrated a reasonable sensitivity at 50 days following infection (Fig. 2a). As time passed following infection, the Abbott-N assay rapidly became seronegative (Fig. 2a), with a median survival time inferred at 175 days (95% CI 168-185 days), whereas the survival probability at 150 days was inferred to be 95% for the Roche-N (95% CI 0.92-0.97), and 91% for the MSD-N assay (95% CI 0.87-0.94). The Roche-S and MSD-S assays remained seropositive for the duration of the study. The MSD-RBD assay showed some evidence of waning seropositivity over time (90% Survival probability at 150 days, 95% CI 0.88-0.94).
A total of 45% (159/329) of the individuals had a negative result using the Abbott-N assay during the course of the study. For the MSD test, 16% (52/329) of participants had a negative test for the N antigen, 11% (36/329) Figure 1. Log transformed serial serological antibody titer data plotted by time from symptom onset. Antibody dynamics are dependent on the assay used with the sensitive Roche-S and MSD-S assay demonstrating maintenance of the spike protein antibody while the nucleoprotein antibody is shown to wane with the MSD and Abbott-N assays but to a lesser extent with the Roche-N assay.    Mathematical model fits to estimate antibody decay. To estimate the decay rate for each antibody and assay studied, a generalized linear mixed model was fitted to the trajectory of antibody decay after 21 days from symptom onset, where the decay rate was estimated as the slope of the antibody titer through time. Under the most sensitive and quantitative Roche -S assay the spike antibody demonstrated no decay at all and rather a slow rate of increased titers over time from symptom onset (0.0031, 95% CI 0.0018-0.0044, Fig. 2b). In accordance with the raw observed data, all nucleoprotein antibodies under the mathematical model decayed. This was most pronounced in the Abbott-N assay (− 0.022, 95% CI − 0.023 to − 0.02) and least pronounced in the Roche -N assay (− 0.0025, 95% CI − 0.0039 to − 0.0012, Fig. 2b, Table 3). The lower performance of the Abbott-N assay can be explained by a lower detection of titer values as their concentration wanes over time. When compared to the quantitative MSD-N, 26% (222/860) of all positive samples by the MSD-N were negative for the Abbott-N test (Fig. 3a). A total of 75% of samples (137/ 183) positive by the MSD-N with an MSD arbitrary titer value lower than 403 were negative for the Abbott-N assay. Using the currently manufacturer recommended threshold of 1.4 arbitrary units, the Abbott-N test was characterized by a high specificity of 0.96 and a sensitivity of 0.74 using all our test results after 14 days. Using a ROC curve (Fig. 3b), the optimal cut-off that maximises both specificity and sensitivity was estimated to be 0.845 arbitrary units.

Discussion
Sensitive measurement of SARS-CoV-2 seropositivity is key to evaluate who has been infected or exposed to SARS-CoV-2, to determine the correlates of protection from future disease, stratify those that need booster vaccination and target the use of anti-SARS-CoV-2 antibodies to those that are seronegative. To our knowledge no other study has evaluated the sensitivity of multiple diagnostic tests in parallel on longitudinally collected serological samples. This study demonstrates that as time elapses after infection, the sensitivity of serological Table 3. Decay rate for each serological assay (log arbitrary units per day) estimated in a generalized linear mixed model. www.nature.com/scientificreports/ testing varies widely depending on the test used. Although serological tests may be demonstrated to perform well 14-21 days after infection, this initial test performance often diminishes as time passes. In order to evaluate whether or not the population maintains SARS-CoV-2 antibodies it is vital that we utilize serological tests that remain sensitive over time. Initial published baseline test performance reports concluded that the Abbott-N assay was a high-performance test and a key tool in SARS-CoV-2 surveillance 18 . Our data demonstrate that as time passes following infection the sensitivity of this assay declines rapidly until at < 6 months following infection it is no more than 50% sensitive. Our findings support the concerns raised by others regarding the poor performance of some nucleoprotein based assays 21,22 .
In contrast, the Roche assays, particularly the Roche Elecsys Anti-SARS-CoV-2 Spike assay maintained high sensitivity for the 200-day duration of the study. Although there remains no single correlate of sterilizing or protective immunity following SARS-CoV-2 infection or vaccination, it is clear that natural infection and the presence of neutralizing spike antibodies decreases the possibility of re-infection and the severity of disease upon re-exposure to currently circulating strains 23 . Our finding that spike antibodies remained at high titers 200 days after infection adds to our previous study on this topic 10 and provides further evidence in support of long-lasting protection against severe disease from currently circulating strains. Fitting mathematical models to the raw data of the Roche spike assay demonstrated that spike antibody titers did not decay but rather increased slightly over the duration of the study. The Roche nucleoprotein assay also maintained sensitivity for the duration of the study with a low rate of decay. Although this assay is semi-quantitative, our findings suggest that this could be used to sensitively identify those that have been vaccinated from those that have been both vaccinated and infected.
Many studies have evaluated the impact of time on test sensitivity over the first 3 weeks following symptom onset [24][25][26] . However, we found no other study that had examined the sensitivity of antibody testing on parallel longitudinal samples collected between 1 and 6 months after infection or exposure. Assays with a higher titer cut-off for detection may perform well in the initial period after infection, but fail to detect seropositivity as antibody levels wane over time. We show that the Abbott-N test failed to detect 75% of samples positive for the MSD-N with a titer value lower than 403, which makes the Abbott-N assay less suitable for seroprevalence studies. Using a ROC curve and the MSD-N and Roche-N assays as the gold standard, we showed that a lower threshold of 0.845 instead of 1.4 arbitrary units may be more suitable to optimize the sensitivity and specificity. Even though different thresholds may be relevant depending on whether sensitivity or specificity needs to be prioritized, our findings suggest that the high Abbott-N test threshold results in a high number of false negatives. These findings are concordant with previous reports showing a range of high uncertainty between 0.49 and 1.4 27 . Barzin et al. 28 used Abbott-N testing alone to determine SARS-CoV-2 seroprevalence in 2,973 asymptomatic out-patients in North Carolina estimating a seroprevalence of 0.8%. Similarly, Wilkins et al. 29 used Abbott-N on 6510 healthcare workers up to 150 days after symptom onset and estimated a seroprevalence of 4.8%. Our findings suggest that previously published surveys of SARS-CoV-2 seroprevalence such as these could have significantly underestimated the true prevalence of SARS-CoV-2 humoral immunity.
Memory T-cell interferon gamma release or proliferation assays in response to SARS-CoV-2 antigens provide an alternative means of assessing prior exposure to infection. However, these assays are limited by cross reactive immunity to the seasonal coronaviruses decreasing specificity 30,31 .
Although all serological tests used in the study demonstrated a high initial specificity, one limitation is that only 38% of participants had a confirmatory SARS-CoV-2 PCR result. Our data may therefore be influenced by an unknown proportion of falsely positive serological tests. However, at entry to the study, all seropositive participants had both a screening EDI nucleoprotein assay and an MSD assay performed which limited the chances of a falsely positive result due to a single erroneous test. Not all samples were processed at the same time; the Roche and Abbott-N assays were processed 3 months after the MSD assays. Despite this, we believe that sample storage and freeze-thawing cycles are unlikely to have influenced our findings as the Roche quantitative spike assay was performed last and demonstrated the highest prolonged levels of spike antibody of all tests used.
In summary, although serological tests may demonstrate high sensitivity 3-weeks after SARS-CoV-2 infection, this is far from the case with some tests 6-months after infection. The Abbott-N assay performed poorly at this time, whereas the Roche and MSD tests maintained a high sensitivity for the 200 days of the study. Tests that perform poorly over time will lead to spurious estimates in population level seroprevalence studies and findings from these studies should be adjusted to account for sensitivity of the test used and the time since infection. Test performance as time passes post infection should be considered before evaluating who is a candidate for booster vaccination or anti-SARS-CoV-2 antibody therapy.